Welcome to the May 2020 report from the Reproducible Builds project.
One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. Nonetheless, whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into seemingly secure software during the various compilation and distribution processes.
In these reports we outline the most important things that we and the rest of the community have been up to over the past month.
News
The Corona-Warn app, which helps trace infection chains of SARS-CoV-2/COVID-19 in Germany, received a feature request asking that it build reproducibly.
A number of academics from Cornell University have published a paper titled Backstabber's Knife Collection, which reviews various open source software supply chain attacks:
Recent years saw a number of supply chain attacks that leverage the increasing use of open source during software development, which is facilitated by dependency managers that automatically resolve, download and install hundreds of open source packages throughout the software life cycle.
In related news, the LineageOS Android distribution announced that a hacker had gained access to their server infrastructure after exploiting an unpatched vulnerability.
Marcin Jachymiak of the Sia decentralised cloud storage platform posted on their blog that their siac and siad utilities can now be built reproducibly:
This means that anyone can recreate the same binaries produced from our official release process. Now anyone can verify that the release binaries were created using the source code we say they were created from. No single person or computer needs to be trusted when producing the binaries now, which greatly reduces the attack surface for Sia users.
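The check described in the quote above can be sketched in a few lines of Python. This is only an illustration of the general idea and not Sia's actual release tooling: the function names sha256_of and is_reproduced are hypothetical, and the sketch assumes you already have both the officially released binary and your own rebuild on disk.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large binaries do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_reproduced(official: Path, rebuilt: Path) -> bool:
    """True if the local rebuild is bit-for-bit identical to the official binary."""
    return sha256_of(official) == sha256_of(rebuilt)
```

If the two digests differ, a tool such as diffoscope can then be used to locate where the binaries diverge.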
Synchronicity is a distributed build system for Rust build artifacts which have been published to crates.io. The goal of Synchronicity is to provide a distributed binary transparency system which is independent of any central operator.
The Comparison of Linux distributions article on Wikipedia now features a Reproducible Builds column indicating whether each distribution is working towards reproducible builds and how far it has progressed.
Distribution work
In Debian this month:
In Alpine Linux, an issue was filed and closed regarding the reproducibility of .apk packages.
Allan McRae of the Arch Linux project posted their third Reproducible builds progress report to the arch-dev-public mailing list, which includes the following call for help:
We also need help to investigate and fix the packages that fail to reproduce that we have not investigated as of yet.
In openSUSE, Bernhard M. Wiedemann published his monthly Reproducible Builds status update.
Software development
diffoscope
Chris Lamb made the changes listed below to diffoscope, our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. He also prepared and uploaded versions 142, 143, 144, 145 and 146 to Debian, PyPI, etc.
- Comparison improvements:
  - Improve fuzzy matching of JSON files as file(1) now supports recognising JSON data. (#106)
  - Refactor .changes and .buildinfo handling to show all details (including the GnuPG header and footer components) even when referenced files are not present. (#122)
  - Use our BuildinfoFile comparator (etc.) regardless of whether the associated files (such as the orig.tar.gz and the .deb) are present.
  - Include GnuPG signature data when comparing .buildinfo, .changes, etc.
  - Add support for printing Android APK signatures via apksigner(1). (#121)
  - Identify iOS App Zip archive data as .zip files. (#116)
  - Add support for Apple Xcode .mobileprovision files. (#113)
- Bug fixes:
  - Don't print a traceback if we pass a single, missing argument to diffoscope (eg. a JSON diff to re-load).
  - Correct a "differences" typo in the ApkFile handler. (#127)
- Output improvements:
  - Never emit the same id="foo" anchor reference twice in the HTML output, otherwise identically-named parts cannot be linked to via a #foo anchor. (#120)
  - Never emit an empty id anchor either; it is not possible to link to #.
  - Don't pretty-print the output when using the --json presenter; it will usually be too complicated to be readable by a human anyway.
  - Use SHA256 rather than MD5 when generating page names for the HTML directory-style presenter. (#124)
- Reporting improvements:
  - Clarify the message when we truncate the number of lines to standard error and reduce the maximum number of lines printed to 25, as the error is usually obvious by then.
  - Print the amount of free space that we have available in our temporary directory as a debugging message.
  - Clarify "Command [...] failed with exit code" messages to remove the duplicate "exited with exit" but also to note that diffoscope is interpreting this as an error.
  - Don't leak the full path of the temporary directory in "Command [...] exited with 1" messages. (#126)
  - Clarify the warning message when we cannot import the debian Python module.
  - Don't repeat "stderr from" if both commands emit the same output.
  - Clarify that an external command emits output for both files, otherwise it can look like we are repeating ourselves when, in reality, it is being run twice.
- Testsuite improvements:
  - Prevent apksigner test failures due to lack of binfmt_misc, eg. on Salsa CI and elsewhere.
  - Drop .travis.yml as we use Salsa instead.
- Dockerfile improvements:
  - Add a .dockerignore file to whitelist files we actually need in our container. (#105)
  - Use ARG instead of ENV when setting up the DEBIAN_FRONTEND environment variable at runtime. (#103)
  - Run as a non-root user in container. (#102)
  - Install and then remove the build-essential package during the build so we can install the recommended packages from Git.
- Codebase improvements:
  - Bump the officially required version of Python from 3.5 to 3.6. (#117)
  - Drop the (default) shell=False keyword argument to subprocess.Popen so that the potentially-unsafe shell=True is more obvious.
  - Perform string normalisation in Black and include the Black output in the assertion failure too.
  - Inline MissingFile's special handling of deb822 to prevent leaking through abstraction layers.
  - Allow a bare try/except block when cleaning up temporary files with respect to the flake8 quality assurance tool.
  - Rename in_dsc_path to dsc_in_same_dir to clarify the use of this variable.
  - Abstract out the duplicated parts of the debian_fallback class and add descriptions for the file types.
  - Various commenting and internal documentation improvements.
  - Rename the Openssl command class to OpenSSLPKCS7 to accommodate other command names with this prefix.
- Misc:
  - Rename the --debugger command-line argument to --pdb.
  - Normalise filesystem stat(2) birth times (ie. st_birthtime) in the same way we do with the stat(1) command's Access: and Change: times to fix a nondeterministic build failure in GNU Guix. (#74)
  - Ignore case when ordering our file format descriptions.
  - Drop, add and tidy various module imports.
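To illustrate the two HTML-related items under "Output improvements" above (unique, non-empty id anchors and SHA256-based page names), here is a minimal sketch. It is not diffoscope's actual implementation: the helper names unique_anchor and page_name, the "anchor" fallback and the numeric suffix scheme are all assumptions made for this example.

```python
import hashlib


def unique_anchor(name: str, used: set) -> str:
    """Return an HTML id that is non-empty and not already in `used`."""
    # An empty id cannot be linked to, so fall back to a placeholder (assumed name).
    base = name or "anchor"
    candidate = base
    suffix = 2
    # Append an increasing numeric suffix until the id is unique.
    while candidate in used:
        candidate = f"{base}-{suffix}"
        suffix += 1
    used.add(candidate)
    return candidate


def page_name(path: str) -> str:
    """Derive a stable page name from a path using SHA-256 rather than MD5."""
    return hashlib.sha256(path.encode("utf-8")).hexdigest()


used = set()
print([unique_anchor(n, used) for n in ["foo", "foo", "", "foo"]])
# prints ['foo', 'foo-2', 'anchor', 'foo-3']
```

Tracking every emitted id in a single set also guards against a generated name such as foo-2 clashing with a part that is literally named foo-2.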
In addition:
- Jean-Romain Garnier fixed a general issue where, for example, LibarchiveMember's has_same_content method was called regardless of the underlying type of file.
- Daniel Fullmer fixed an issue where some filesystems could only be mounted read-only. (!49)
- Emanuel Bronshtein provided a patch to prevent a build of the Docker image containing parts of the build's. (#123)
- Mattia Rizzolo added an entry to debian/py3dist-overrides to ensure the rpm-python module is used in package dependencies (#89) and moved to using the new execute_after_* and execute_before_* Debhelper rules.
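One of the codebase improvements listed earlier drops the redundant shell=False argument so that any use of shell=True stands out. As a brief sketch of why that distinction matters (the run_tool helper and the sample string are hypothetical, and subprocess.run is used here as a convenience wrapper around subprocess.Popen), passing an argument list with the default shell=False means metacharacters in untrusted input are never interpreted by a shell:

```python
import subprocess
import sys


def run_tool(argv):
    """Run an external command from an argument list; no shell is involved."""
    # shell=False is the default, so it is deliberately not spelled out here.
    return subprocess.run(argv, capture_output=True, text=True, check=False)


# A filename-like string containing shell metacharacters.
untrusted = "report.json; echo injected"

# The child process simply echoes its first argument back.
result = run_tool(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted]
)
print(result.stdout.strip())
# prints report.json; echo injected
```

The string reaches the child process as a single argument. Had it been interpolated into a command line run with shell=True, the shell would have treated the semicolon as a command separator.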
Chris Lamb also performed a huge overhaul of diffoscope's website:
- Add a completely new design.
- Dynamically generate our contributor list and supported file formats from the main Git repository.
- Add a separate, canonical page for every new release.
- Generate a latest release section and display that with the corresponding date on the homepage.
- Add an RSS feed of our releases and add it to Planet Debian.
- Use Jekyll's absolute_url and relative_url where possible and move a number of configuration variables to _config.yml.
Upstream patches
The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:
- Bernhard M. Wiedemann:
- Jelle van der Waa:
  - earlyoom (timestamps in Gzip files)
  - fmt (Don't install sphinx-build cached files as they are unneeded & unreproducible)
  - nvidia-settings (timestamp in Gzip files)
- Chris Lamb:
- Vagrant Cascadian:
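Two of the patches above concern timestamps embedded in Gzip files. The Gzip header contains a modification-time field, so compressing identical data at different times yields different bytes unless that field is pinned. The following sketch shows the general technique in Python; it is not the content of the patches themselves, and the function name gzip_reproducibly is hypothetical.

```python
import gzip
import io


def gzip_reproducibly(data: bytes) -> bytes:
    """Compress data with the Gzip header timestamp fixed to zero."""
    buffer = io.BytesIO()
    # mtime=0 pins the header's modification-time field instead of using "now".
    with gzip.GzipFile(fileobj=buffer, mode="wb", mtime=0) as stream:
        stream.write(data)
    return buffer.getvalue()


first = gzip_reproducibly(b"hello reproducible world\n")
second = gzip_reproducibly(b"hello reproducible world\n")
print(first == second)
# prints True
```

Writing to an in-memory buffer also avoids embedding an original filename in the header, which is another common source of variation between builds.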
This month's report was written by Bernhard M. Wiedemann, Chris Lamb, Holger Levsen, Jelle van der Waa and Vagrant Cascadian. It was subsequently reviewed by a bunch of Reproducible Builds folks on IRC and the mailing list.